AITopics | learning dag

Learning DAGs from Data with Few Root Causes

Neural Information Processing Systems, Dec-24-2025, 14:30:55 GMT

We present a novel perspective and algorithm for learning directed acyclic graphs (DAGs) from data generated by a linear structural equation model (SEM). First, we show that a linear SEM can be viewed as a linear transform that, in prior work, computes the data from a dense input vector of random-valued root causes (as we will call them) associated with the nodes. Instead, we consider the case of (approximately) few root causes and also introduce noise in the measurement of the data. Intuitively, this means that the DAG data is produced by a few data-generating events whose effect percolates through the DAG. We prove identifiability in this new setting and show that the true DAG is the global minimizer of the $L^0$-norm of the vector of root causes. For data satisfying the few root causes assumption, we show superior performance compared to prior DAG learning methods.
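The view of a linear SEM as a linear transform of sparse root causes can be illustrated with a small NumPy sketch. It assumes the row-vector convention X = XA + C (so X = C(I - A)^{-1}), omits the noise terms, and uses a hypothetical three-node chain DAG; the function name `sample_few_root_causes` and the parameter `p_active` are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_few_root_causes(A, n_samples, p_active=0.1, rng=None):
    """Sample data X from X = X A + C, where C holds few nonzero root causes.

    A is a weighted adjacency matrix of a DAG (assumed convention:
    A[i, j] != 0 means an edge i -> j).
    """
    rng = np.random.default_rng(rng)
    d = A.shape[0]
    # Each root-cause entry is active with probability p_active.
    mask = rng.random((n_samples, d)) < p_active
    C = mask * rng.uniform(0.5, 1.5, size=(n_samples, d))
    # For a DAG, A is nilpotent, so I - A is always invertible.
    X = C @ np.linalg.inv(np.eye(d) - A)
    return X, C

# Hypothetical chain DAG 0 -> 1 -> 2 with arbitrary weights.
A = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 0.0, 0.0]])
X, C = sample_few_root_causes(A, n_samples=200, rng=0)

# Under the true DAG, the root causes are recovered as X (I - A).
C_rec = X @ (np.eye(3) - A)
l0_true = np.count_nonzero(np.abs(C_rec) > 1e-9)

# Under a wrong candidate (here the empty graph), the implied
# root causes are X itself, which is denser.
l0_empty = np.count_nonzero(np.abs(X) > 1e-9)
print(l0_true, l0_empty)
```

In this toy setting the true graph yields a sparser root-cause matrix than the empty graph, which is the intuition behind minimizing the $L^0$-norm of the root causes.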

electronic proceedings, learning dag, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

Neural Information Processing Systems, Dec-24-2025, 00:47:59 GMT

The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of $\textit{M-matrices}$, which is a key difference from the classical log-det function defined over the cone of positive definite matrices. Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) is better at detecting large cycles; (2) has better-behaved gradients; and (3) runs about an order of magnitude faster in practice. From the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA ($\textit{Directed Acyclic Graphs via M-matrices for Acyclicity}$), a method that resembles the central path for barrier methods. Each point in the central path of DAGMA is a solution to an unconstrained problem regularized by our log-det function; we then show that, in the limit of the central path, the solution is guaranteed to be a DAG. Finally, we provide extensive experiments for $\textit{linear}$ and $\textit{nonlinear}$ SEMs and show that our approach can reach large speed-ups and smaller structural Hamming distances against state-of-the-art methods.
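A log-det acyclicity function of this kind can be sketched in a few lines of NumPy. The sketch assumes the form h_s(W) = -log det(sI - W∘W) + d log s, defined where the spectral radius of W∘W is below s; the function name `h_logdet`, the default s = 1, and the example matrices are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def h_logdet(W, s=1.0):
    """Log-det acyclicity value: -log det(sI - W*W) + d*log(s).

    W*W is the elementwise (Hadamard) square of W. The function is
    evaluated only where the spectral radius of W*W is below s, which
    is the condition for sI - W*W to be a nonsingular M-matrix.
    """
    d = W.shape[0]
    B = W * W
    if np.max(np.abs(np.linalg.eigvals(B))) >= s:
        raise ValueError("spectral radius of W*W must be smaller than s")
    sign, logabsdet = np.linalg.slogdet(s * np.eye(d) - B)
    return -logabsdet + d * np.log(s)

# A DAG (strictly upper triangular): W*W is nilpotent, so
# det(sI - W*W) = s**d and the value is zero.
W_dag = np.array([[0.0, 0.8, 0.3],
                  [0.0, 0.0, 0.5],
                  [0.0, 0.0, 0.0]])

# A 2-cycle 0 <-> 1: det(I - W*W) = 1 - 0.25 * 0.25 = 0.9375.
W_cyc = np.array([[0.0, 0.5],
                  [0.5, 0.0]])

h_dag = h_logdet(W_dag)
h_cyc = h_logdet(W_cyc)
print(h_dag, h_cyc)
```

The value vanishes on the acyclic example and is strictly positive on the cyclic one, matching the "exact" property claimed for the characterization.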

characterization, learning dag, log-determinant acyclicity characterization, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.58)
Information Technology > Artificial Intelligence > Machine Learning (0.38)

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

Neural Information Processing Systems, May-26-2025, 22:54:19 GMT

The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of \textit{M-matrices}, which is a key difference from the classical log-det function defined over the cone of positive definite matrices. Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) is better at detecting large cycles; (2) has better-behaved gradients; and (3) runs about an order of magnitude faster in practice.

artificial intelligence, machine learning, optimization problem, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Learning DAGs and Root Causes from Time-Series Data

Misiakos, Panagiotis, Püschel, Markus

arXiv.org Machine Learning, Jan-6-2025

Many applications produce time-series data: multi-dimensional data measured at regular time steps. Examples include temperature measurements at different sites in meteorology [Yang et al., 2022], stock prices in finance [Kleinberg, 2013, Jiang and Shimizu, 2023], and brain data in medicine [Smith et al., 2011]. A key problem in analyzing time-series data is causal structure discovery, which aims to understand the generation mechanism of such data between nodes and across time [Assaad et al., 2022b, Runge et al., 2023, Gong et al., 2023, Hasan et al., 2023]. One common structural model associates time-series data with directed acyclic graphs (DAGs) that encode how the data in one time step is obtained from prior ones. Our work specifically focuses on learning these DAGs from time-series data [Sun et al., 2023, Gao et al., 2022, Pamfil et al., 2020]. This approach simplifies the broader problem of causal discovery by abstracting away the need for true causal relationships, which often require techniques like interventions. Despite this simplification, DAG learning from time series still poses a challenge due to the complexity of temporal dependencies and the high dimensionality of data.

artificial intelligence, experiment, machine learning, (16 more...)

arXiv.org Machine Learning

2501.0313

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

Neural Information Processing Systems, Oct-10-2024, 15:25:16 GMT

The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of \textit{M-matrices}, which is a key difference from the classical log-det function defined over the cone of positive definite matrices. Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) is better at detecting large cycles; (2) has better-behaved gradients; and (3) runs about an order of magnitude faster in practice.

characterization, log-det function, log-determinant acyclicity characterization, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Learning DAGs from Data with Few Root Causes

Misiakos, Panagiotis, Wendler, Chris, Püschel, Markus

arXiv.org Artificial Intelligence, May-25-2023

We present a novel perspective and algorithm for learning directed acyclic graphs (DAGs) from data generated by a linear structural equation model (SEM). First, we show that a linear SEM can be viewed as a linear transform that, in prior work, computes the data from a dense input vector of random-valued root causes (as we will call them) associated with the nodes. Instead, we consider the case of (approximately) few root causes and also introduce noise in the measurement of the data. Intuitively, this means that the DAG data is produced by a few data-generating events whose effect percolates through the DAG. We prove identifiability in this new setting and show that the true DAG is the global minimizer of the $L^0$-norm of the vector of root causes. For data with few root causes, with and without noise, we show superior performance compared to prior DAG learning methods.

artificial intelligence, dag, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.15936

Country: Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

Bello, Kevin, Aragam, Bryon, Ravikumar, Pradeep

arXiv.org Artificial Intelligence, Jan-15-2023

The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of $\textit{M-matrices}$, which is a key difference from the classical log-det function defined over the cone of positive definite matrices. Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) is better at detecting large cycles; (2) has better-behaved gradients; and (3) runs about an order of magnitude faster in practice. From the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA ($\textit{DAGs via M-matrices for Acyclicity}$), a method that resembles the central path for barrier methods. Each point in the central path of DAGMA is a solution to an unconstrained problem regularized by our log-det function; we then show that, in the limit of the central path, the solution is guaranteed to be a DAG. Finally, we provide extensive experiments for $\textit{linear}$ and $\textit{nonlinear}$ SEMs and show that our approach can reach large speed-ups and smaller structural Hamming distances against state-of-the-art methods. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/dagma.
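The differentiability claim can be made concrete with a short gradient check. The sketch below assumes the log-det function has the form h_s(W) = -log det(sI - W∘W) + d log s, for which the standard log-determinant derivative identity gives the gradient 2 (sI - W∘W)^{-T} ∘ W. The helper names and the test matrices are illustrative, and nothing here is taken from the linked repository.

```python
import numpy as np

def h_logdet(W, s=1.0):
    """Assumed form of the log-det acyclicity function."""
    d = W.shape[0]
    _, logabsdet = np.linalg.slogdet(s * np.eye(d) - W * W)
    return -logabsdet + d * np.log(s)

def grad_h_logdet(W, s=1.0):
    """Closed-form gradient: 2 * (sI - W*W)^{-T} * W (elementwise product)."""
    d = W.shape[0]
    M = s * np.eye(d) - W * W
    return 2.0 * np.linalg.inv(M).T * W

def numerical_grad(f, W, eps=1e-6):
    """Central finite differences, used only to verify the closed form."""
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            G[i, j] = (f(W + E) - f(W - E)) / (2.0 * eps)
    return G

# Small random weights keep the spectral radius of W*W below s = 1.
rng = np.random.default_rng(0)
W = rng.uniform(-0.3, 0.3, size=(4, 4))
G_exact = grad_h_logdet(W)
G_num = numerical_grad(h_logdet, W)

# At a DAG the gradient vanishes: an edge i -> j has no return walk j -> i.
W_dag = np.triu(rng.uniform(0.5, 1.0, size=(4, 4)), k=1)
G_dag = grad_h_logdet(W_dag)
```

A vanishing gradient at every DAG is consistent with DAGs being the minimizers (value zero) of a nonnegative acyclicity function.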

artificial intelligence, log-determinant acyclicity characterization, optimization problem, (2 more...)

arXiv.org Artificial Intelligence

2209.08037

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)

Learning DAGs with continuous optimization

AIHub, Apr-27-2020, 11:23:01 GMT

As datasets continually increase in size and complexity, our ability to uncover meaningful insights from unstructured and unlabeled data is crucial. At the same time, a premium has been placed on delivering simple, human-interpretable, and trustworthy inferential models of data. One promising class of such models is graphical models, which have been used to extract relational information from massive datasets arising from a wide variety of domains including biology, medicine, business, and finance, just to name a few. Graphical models are families of multivariate distributions with compact representations expressed as graphs. In both undirected (Markov networks) and directed (Bayesian networks) graphical models, the graph structure guides the factorization of the joint distribution into smaller local specifications such as clique potentials or local conditionals of a variable given its "parent" variables.

continuous optimization, graph, graphical model, (16 more...)

AIHub

Industry: Health & Medicine (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.52)
